Detecting Promotional Content in Wikipedia

Authors

  • Shruti Bhosale
  • Heath Vinicombe
  • Raymond J. Mooney
Abstract

This paper presents an approach for detecting promotional content in Wikipedia. By incorporating stylometric features, including features based on n-gram and PCFG language models, we demonstrate improved accuracy at identifying promotional articles, compared to using only lexical information and metafeatures.
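
To make the idea of language-model-based stylometric features concrete, the following is a minimal sketch, not the authors' implementation. It assumes two add-one smoothed bigram models, one trained on promotional text and one on neutral text, and uses an article's average log-probability under each model (and their difference) as features. The class `BigramLM`, the function `lm_features`, and the toy sentences are all hypothetical names and data introduced for illustration; PCFG-based features are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper tokenizer.
    return text.lower().split()


class BigramLM:
    """Add-one smoothed bigram language model (illustrative assumption)."""

    def __init__(self, sentences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for sentence in sentences:
            toks = ["<s>"] + tokenize(sentence) + ["</s>"]
            self.vocab.update(toks)
            self.context_counts.update(toks[:-1])
            self.bigram_counts.update(zip(toks, toks[1:]))

    def avg_logprob(self, text):
        # Average per-bigram log-probability of the text under this model.
        toks = ["<s>"] + tokenize(text) + ["</s>"]
        v = len(self.vocab) + 1  # one extra type reserved for unseen words
        total = 0.0
        for prev, cur in zip(toks, toks[1:]):
            num = self.bigram_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + v
            total += math.log(num / den)
        return total / (len(toks) - 1)


def lm_features(text, promo_lm, neutral_lm):
    # Features a downstream classifier could combine with lexical features.
    p = promo_lm.avg_logprob(text)
    n = neutral_lm.avg_logprob(text)
    return {"promo_lp": p, "neutral_lp": n, "lp_diff": p - n}


# Toy training data, invented purely for this example.
promo_lm = BigramLM([
    "our award winning team delivers the best solutions",
    "the leading provider of innovative world class services",
])
neutral_lm = BigramLM([
    "the company was founded in 1998 in ohio",
    "the town has a population of four thousand",
])

print(lm_features("the best provider of innovative solutions",
                  promo_lm, neutral_lm))
```

A positive `lp_diff` indicates the text is better explained by the promotional model than by the neutral one. In practice such scores would be fed to a classifier alongside other features rather than thresholded directly.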

Similar resources

Towards Detecting Wikipedia Task Contexts

Wikipedia is a resource used by many people for many different purposes. We posit that it might be beneficial to alter the content or the way content is presented depending on the task context. Here we describe a small pilot lab study to investigate features of interaction that might help to infer the contextual situation surrounding Wikipedia search tasks. We describe our effort to collect dat...

Using Language Models to Detect Wikipedia Vandalism

This paper explores a statistical language modeling approach for detecting Wikipedia vandalism. Wikipedia is a popular and influential collaborative information system. The collaborative nature of authoring, as well as the high visibility of its content, has exposed Wikipedia articles to vandalism, defined as malicious editing intended to compromise the integrity of the content of articles. Ex...

The Workshops of the Tenth International AAAI Conference on Web and Social Media

Event detection in social media usually exploits information from social-networking platforms, such as Twitter or Facebook. However, previous research has suggested that Wikipedia constitutes a valuable source of information for the task of detecting breaking news. In this work we adapt a graph-based algorithm to the Wikipedia context, and compare it to the state-of-the-art Wikipedia real-time ...

Detecting Controversial Articles in Wikipedia

In this paper, we apply graphical models to facilitate quantitative and qualitative investigations into the edit history of articles posted on Wikipedia. Quantitatively, we use the models to measure controversy arising from Wikipedia articles. Qualitatively, we use the models to provide insights into the distribution of editor roles associated with articles. The paper includes exercises that ca...

Identifying, Understanding and Detecting Recurring, Harmful Behavior Patterns in Collaborative Wikipedia Editing – Doctoral Proposal – Fabian

In this doctoral proposal, we describe an approach to identify recurring, collective behavioral mechanisms in the collaborative interactions of Wikipedia editors that have the potential to undermine the ideals of quality, neutrality and completeness of article content. We outline how we plan to parametrize these patterns in order to understand their emergence and evolution and measure their eff...

Publication date: 2013